ON MULLIN'S SECOND SEQUENCE OF PRIMES 

Andrew R. Booker

1. Introduction 

In [10], Mullin constructed two sequences of prime numbers related to Euclid's proof that there are infinitely many primes. For the first sequence, say $\{p_n\}_{n=1}^\infty$, we take $p_1 = 2$ and define $p_{n+1}$ to be the smallest prime factor of $1 + p_1 \cdots p_n$. The second sequence, $\{P_n\}_{n=1}^\infty$, is defined similarly, except that we replace the words "smallest prime factor" by "largest prime factor". These are sequences A000945 and A000946 in the OEIS [1], and the first few terms of each are shown below.
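As a quick illustration of the two definitions, the following Python sketch generates the first few terms of each sequence by trial division. The names `prime_factors` and `euclid_mullin` are ours, introduced only for this illustration; trial division is adequate for the handful of terms computed here but not for later terms, which require serious factoring effort.

```python
def prime_factors(n):
    """Return the distinct prime factors of n >= 2 in increasing order (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def euclid_mullin(pick, count):
    """First `count` terms, where `pick` selects one prime factor of 1 + (product so far)."""
    terms, product = [], 1
    for _ in range(count):
        p = pick(prime_factors(product + 1))
        terms.append(p)
        product *= p
    return terms


first = euclid_mullin(min, 8)   # smallest prime factor at each step
second = euclid_mullin(max, 7)  # largest prime factor at each step
print(first)
print(second)
```

The two sequences agree for the first four terms and diverge at the fifth, because $1 + 2 \cdot 3 \cdot 7 \cdot 43 = 1807 = 13 \cdot 139$.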
Mullin then asked whether every prime is contained in each of these sequences, and if not, whether they are recursive, i.e. whether there is an algorithm to decide if a given prime occurs or not.¹ Almost nothing related to this is known for the first sequence, though Shanks [13] has conjectured on probabilistic grounds that it contains every prime; we briefly discuss this conjecture and some variants in Section 2 below. Concerning the second sequence, Cox and van der Poorten [4] showed that, apart from the first four terms 2, 3, 7 and 43, it omits all the primes less than 53; it is straightforward to extend this to the remaining primes less than 79 by applying their method using the most recent computations of $P_n$, due to Wagstaff [14]. In response to Mullin's questions, Cox and van der Poorten conjectured that infinitely many primes are omitted, and that their method would always work to decide whether a given prime occurs; moreover, they showed that at least one of their conjectures is true. The main point of this paper is to prove the first of these conjectures. Precisely, we show the following.

¹ Mullin also asked whether the second sequence might be monotonic (and hence recursive); this was answered negatively by Naur [11], who was the first to compute it beyond the 9th term. However, it remains an open question whether there are infinitely many $n$ such that $P_n > P_{n+1}$.

Theorem 1. The sequence $\{P_n\}_{n=1}^\infty$ omits infinitely many primes. If $\{Q_n\}_{n=1}^\infty$ denotes the sequence of omitted primes in increasing order, then

$$\limsup_{n\to\infty} \frac{\log Q_{n+1}}{\log(Q_1 \cdots Q_n)} \le \frac{1}{4\sqrt{e} - 1} = 0.1787\ldots.$$

We note that although our method of proof allows us to bound each omitted prime $Q_n$ in terms of the previous ones, it is not constructive; in particular, Mullin's second question remains open (see Theorem 2 below, however).

The number $\frac{1}{4\sqrt{e} - 1}$ in the theorem is related to the best-known bound $O\bigl(p^{\frac{1}{4\sqrt{e}} + o(1)}\bigr)$ for the least quadratic non-residue (mod $p$). This was first shown by Burgess [2], based on an argument of Vinogradov; apart from refinements of the $o(1)$, it has not been improved upon in over 50 years. However, if the Generalized Riemann Hypothesis for quadratic Dirichlet L-functions is true then one can show the much stronger bound $Q_{n+1} = O(\log^2(Q_1 \cdots Q_n))$, from which it follows that

$$\#\{n : Q_n \le x\} \gg \frac{\sqrt{x}}{\log x}$$

for large $x$. Even this seems far from the truth; indeed, it is likely that the set of primes that occur in $\{P_n\}_{n=1}^\infty$ has density 0. While we have not been able to prove that unconditionally, by refining Cox and van der Poorten's argument on the relationship between their conjectures, we can show the following.

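Theorem 2. If $\{P_n\}_{n=1}^\infty$ is not recursive then it has logarithmic density 0 in the primes, i.e.

$$\lim_{x\to\infty} \frac{\displaystyle\sum_{\substack{p \le x \\ p \in \{P_1, P_2, \ldots\}}} \frac{1}{p}}{\displaystyle\sum_{p \le x} \frac{1}{p}} = 0.$$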
2. Variants



Before embarking on the proofs of Theorems 1 and 2, we set our results in context by comparing to a few variants of the sequence $\{P_n\}_{n=1}^\infty$.

(1) As mentioned above, very little is known about Mullin's first sequence $\{p_n\}_{n=1}^\infty$. Shanks reasoned that as $n$ increases, the numbers $t_n = p_1 \cdots p_n$ should vary randomly among the invertible residue classes (mod $p$) for any fixed prime $p$, until $p$ occurs in the sequence, after which point $t_n \equiv 0 \pmod{p}$. If $p$ does not occur then this is violated, since $t_n$ is always invertible (mod $p$) but falls into the residue class of $-1$ at most finitely many times. As no one has found any reason to suggest that $t_n$ does not vary randomly (mod $p$), this is certainly compelling. However, there is reason to tread cautiously, first because Kurokawa and Satoh [7] have shown that an analogue of this conjecture for the Euclidean domains $\mathbb{F}_p[x]$ is false in general, and second because of what happens in the next variant that we consider.

(2) In the second variant, instead of just introducing one new prime at each step, we add in all prime divisors of 1 plus the product of the previously constructed primes. In symbols, we set $S_0 = \emptyset$ and define $S_n$ recursively by

$$S_{n+1} = S_n \cup \Bigl\{ p : p \text{ prime and } p \Bigm| 1 + \prod_{q \in S_n} q \Bigr\}.$$

This is related to Sylvester's sequence $\{s_n\}_{n=0}^\infty$, defined by $s_n = 1 + \prod_{i=0}^{n-1} s_i$, or equivalently, $s_0 = 2$, $s_{n+1} = 1 + s_n(s_n - 1)$. More precisely, there is empirical evidence to suggest that $s_n$ is always squarefree, and if that is the case then

$$\prod_{p \in S_n} p = \prod_{i=0}^{n-1} s_i.$$

In particular, each prime that we construct this way divides some Sylvester number. One could try applying the same sort of reasoning as in Shanks' conjecture for this sequence, but it turns out that there is a conspiracy preventing this from working, since $s_n$ can be described by a one-step recurrence. In fact, Odoni [12] showed that the set of primes dividing a Sylvester number has density 0. Thus, perhaps counterintuitively, the greedy algorithm of adding in all prime divisors likely yields a very thin subset of the primes.
(3) Pomerance considered the following variant (unpublished, but see [5, §1.1.3]). Let $r_1 = 2$, and define $r_{n+1}$ recursively to be the smallest prime number which is not one of $r_1, \ldots, r_n$ and divides a number of the form $d + 1$, where $d \mid r_1 \cdots r_n$. This is in some sense even greedier than the previous variant, but the fact that we can choose proper divisors $d$ of $r_1 \cdots r_n$ prevents the numbers from growing out of control. Thus, Pomerance showed that every prime does indeed occur in this sequence, and in fact $r_n$ is just the $n$th prime number for $n \ge 5$.
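The recursion in variant (3) is easy to experiment with. The sketch below is a naive illustration, not Pomerance's argument: it enumerates all divisors $d$ of $r_1 \cdots r_n$ as products of subsets, which is exponential in $n$ and only suitable for the first few terms. The names `is_prime` and `pomerance_sequence` are ours.

```python
from itertools import combinations
from math import prod


def is_prime(n):
    """Primality by trial division; fine for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def pomerance_sequence(count):
    """First `count` terms r_1, r_2, ... of the variant described in (3)."""
    r = [2]
    while len(r) < count:
        # Every divisor of the squarefree number r_1 ... r_n is a product of a subset.
        divisors = [prod(c) for k in range(len(r) + 1) for c in combinations(r, k)]
        p = 2
        # Advance to the smallest unused prime dividing d + 1 for some divisor d.
        while p in r or not is_prime(p) or all((d + 1) % p for d in divisors):
            p += 1
        r.append(p)
    return r


print(pomerance_sequence(8))
```

The only deviation from the sequence of primes in this range is at the third and fourth terms, where 7 appears before 5.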

3. Proofs 

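We begin by reviewing the method of [4]. For a positive integer $n$, suppose that $1 + P_1 \cdots P_n$ has the factorization

$$(*) \qquad 1 + P_1 \cdots P_n = q_1^{k_1} \cdots q_r^{k_r},$$

where $q_1 < \ldots < q_r$ are prime and $q_r = P_{n+1}$. Observe that the left-hand side is $\equiv 3 \pmod{4}$, so that

$$\Bigl(\frac{-4}{q_1}\Bigr)^{k_1} \cdots \Bigl(\frac{-4}{q_r}\Bigr)^{k_r} = -1,$$

where $\bigl(\frac{\cdot}{\cdot}\bigr)$ denotes the Kronecker symbol. Similarly, if $d$ is a fundamental discriminant dividing $P_1 \cdots P_n$ then the left-hand side is $\equiv 1 \pmod{d}$, so that

$$\Bigl(\frac{d}{q_1}\Bigr)^{k_1} \cdots \Bigl(\frac{d}{q_r}\Bigr)^{k_r} = 1.$$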

Cox and van der Poorten considered values of $d$ for which $|d|$ is one of the known $P_j$, thus obtaining a system of equations which they attempted to solve by linear algebra over $\mathbb{F}_2$. As more of the $P_j$ become known, one adds more and more constraints that must be satisfied by the small primes $q$ which have not yet occurred, and one can hope eventually to reach an inconsistent system. There is no known reason to believe that the equations for the various $P_j$ are related, and this motivates their conjectures.
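To make the two character identities concrete, here is a small numerical check on the step $n = 4$, where $1 + 2 \cdot 3 \cdot 7 \cdot 43 = 1807 = 13 \cdot 139$. This is only an illustration: it uses the fact that for an odd prime $q$ the Kronecker symbol $\bigl(\frac{d}{q}\bigr)$ coincides with the Legendre symbol, computed here by Euler's criterion, and the helper names are ours.

```python
def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q, via Euler's criterion."""
    t = pow(a % q, (q - 1) // 2, q)
    return -1 if t == q - 1 else t


# 1 + 2*3*7*43 = 1807 = 13 * 139, so the factorization has q_1 = 13, q_2 = 139 (= P_5).
factorization = {13: 1, 139: 1}


def character_product(d):
    """Product of (d/q_i)^{k_i} over the prime factorization above."""
    result = 1
    for q, k in factorization.items():
        result *= legendre(d, q) ** k
    return result


# Constraint from the left-hand side being 3 (mod 4): the product must be -1.
print(character_product(-4))
# Constraints from fundamental discriminants dividing 2*3*7*43: each product must be +1.
print([character_product(d) for d in (-3, -7, -43)])
```

Each known term $P_j \equiv 3 \pmod 4$ contributes one discriminant $d = -P_j$, and hence one linear condition over $\mathbb{F}_2$ on the exponents $k_i$.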

An equivalent formulation of their method is to look for a fundamental discriminant $d$ composed of known $P_j$ such that $\bigl(\frac{d}{q}\bigr) = \bigl(\frac{-4}{q}\bigr)$ for the first several primes $q$ which are not known to occur. This is the approach that we will take, as outlined in the following lemmas.

Lemma 1. Let $\chi$ (mod $q$) be a non-principal quadratic character, not necessarily primitive. Then there is a prime number $n \ll_\varepsilon q^{\frac{1}{4\sqrt{e}} + \varepsilon}$ such that $\chi(n) = -1$.

Proof. Let $n$ be the smallest positive integer such that $\chi(n) = -1$. It is clear that $n$ must be prime, so it suffices to prove the upper bound. This is essentially a special case of [H Theorem 1], except for the technical point that $q$ need not be cubefree.

To circumvent that, we factor $\chi = \chi_0 \chi_1$ where $\chi_0$ (mod $q_0$) is trivial and $\chi_1$ (mod $q_1$) is a primitive quadratic character. Note that if we replace $q_0$ by $q_0' = \prod_{p \mid q_0,\ p \nmid q_1} p$ and $\chi_0$ by the trivial character $\chi_0'$ (mod $q_0'$), then $\chi' = \chi_0' \chi_1$ satisfies $\chi'(m) = \chi(m)$ for every $m$. Thus, we may assume without loss of generality that $q_0$ is squarefree and $(q_0, q_1) = 1$.

Moreover, $\pm q_1$ is a fundamental discriminant, so in fact $q = q_0 q_1$ is cubefree except possibly for a factor of 8. Even if $8 \mid q$, one can see that Burgess' bounds [2, Theorem 2], on which [SI Theorem 1] is based, continue to hold at the expense of a worse implied constant. (See [SI (12.56)] for a precise statement of this type.) The result follows. □

Lemma 2. Let $q_1, \ldots, q_r$ be pairwise relatively prime positive integers. For each $i = 1, \ldots, r$, let $\chi_i$ (mod $q_i$) be a non-principal quadratic character, not necessarily primitive, and let $\epsilon_i \in \{\pm 1\}$. Then there is a squarefree positive integer $n$ with at most $r$ prime factors, each $\ll_\varepsilon (q_1 \cdots q_r)^{\frac{1}{4\sqrt{e}} + \varepsilon}$, such that $\chi_i(n) = \epsilon_i$ for all $i = 1, \ldots, r$.

Proof. Let $\psi_i$ be the principal character mod $q_i$ for $i = 1, \ldots, r$, and set $q = q_1 \cdots q_r$. For each non-empty subset $S \subseteq \{1, \ldots, r\}$ we define a character $\chi_S$ (mod $q$) by

$$\chi_S(n) = \prod_{i=1}^{r} \begin{cases} \chi_i(n) & \text{if } i \in S, \\ \psi_i(n) & \text{if } i \notin S. \end{cases}$$

Note that $\chi_S$ must be non-trivial since the $q_i$ are pairwise relatively prime. By Lemma 1, there is a prime $n_S \ll_\varepsilon q^{\frac{1}{4\sqrt{e}} + \varepsilon}$ such that $\chi_S(n_S) = -1$. Further, we associate to $S$ two vectors in $\mathbb{F}_2^r$. The first is the characteristic vector $v_S = (a_1, \ldots, a_r)$, defined by

$$a_i = \begin{cases} 1 & \text{if } i \in S, \\ 0 & \text{if } i \notin S. \end{cases}$$

The second is the unique vector $w_S = (b_1, \ldots, b_r)$ such that $\chi_i(n_S) = (-1)^{b_i}$ for $i = 1, \ldots, r$. These vectors have scalar product $v_S \cdot w_S = 1$ since $\chi_S(n_S) = -1$.

We claim that $\{w_S : \emptyset \ne S \subseteq \{1, \ldots, r\}\}$ spans $\mathbb{F}_2^r$. If not then there would be a non-zero linear functional which vanishes at each such $w_S$, i.e. a non-zero $v \in \mathbb{F}_2^r$ with $v \cdot w_S = 0$ for all $S \ne \emptyset$. However, this is impossible since the $v_S$ exhaust all non-zero vectors in $\mathbb{F}_2^r$.

Therefore, there is a set $T$ of non-empty subsets of $\{1, \ldots, r\}$ such that $\{w_S : S \in T\}$ is a basis for $\mathbb{F}_2^r$. It follows that the numbers $n_S$ for $S \in T$ are distinct primes, and as $n$ ranges over the divisors of $\prod_{S \in T} n_S$, $(\chi_1(n), \ldots, \chi_r(n))$ ranges over all elements of $\{\pm 1\}^r$. □
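The construction in this proof can be carried out explicitly in small cases. The sketch below makes several simplifying assumptions that are ours, not the paper's: the moduli are distinct odd primes, each $\chi_i$ is the Legendre symbol, and each $n_S$ is found by brute-force search rather than by appeal to the bound of Lemma 1. Vectors in $\mathbb{F}_2^r$ are stored as bitmasks, and the function names are hypothetical.

```python
from itertools import combinations, product


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, as 1, -1 or 0."""
    t = pow(a % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def least_witness(S, moduli):
    """Least n coprime to every modulus with prod_{i in S} (n/q_i) = -1 (necessarily prime)."""
    n = 2
    while True:
        vals = [legendre(n, q) for q in moduli]
        if 0 not in vals:
            sign = 1
            for i in S:
                sign *= vals[i]
            if sign == -1:
                return n
        n += 1


def solve_signs(moduli, eps):
    """Squarefree n with (n/q_i) = eps_i for all i, built from the witnesses n_S."""
    r = len(moduli)
    basis = {}  # pivot bit -> (reduced vector, set of witness primes whose w-vectors sum to it)
    for size in range(1, r + 1):
        for S in combinations(range(r), size):
            n_S = least_witness(S, moduli)
            w = sum(1 << i for i, q in enumerate(moduli) if legendre(n_S, q) == -1)
            combo = frozenset([n_S])
            while w:  # Gaussian elimination over F_2
                pivot = w.bit_length() - 1
                if pivot not in basis:
                    basis[pivot] = (w, combo)
                    break
                w ^= basis[pivot][0]
                combo ^= basis[pivot][1]
    target = sum(1 << i for i, e in enumerate(eps) if e == -1)
    chosen = frozenset()
    while target:  # every pivot exists because the w_S span F_2^r
        pivot = target.bit_length() - 1
        target ^= basis[pivot][0]
        chosen ^= basis[pivot][1]
    n = 1
    for p in chosen:
        n *= p
    return n


moduli = [3, 5, 7]
for eps in product([1, -1], repeat=len(moduli)):
    n = solve_signs(moduli, eps)
    assert all(legendre(n, q) == e for q, e in zip(moduli, eps))
    print(eps, n)
```

Because the symmetric difference of prime sets mirrors addition of the corresponding $w$-vectors, the output is always a product of distinct witness primes, at most $r$ of them.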
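Proof of Theorem 1. Let $Q_1, \ldots, Q_r$ be the first $r$ omitted primes. (We allow $r = 0$ to start the argument, with the understanding that $Q_1 \cdots Q_r = 1$ in that case.) Suppose that all other primes up to some number $x > 3$ eventually occur, and let $p = P_{n+1} \le x$ be the last to occur. Then except for $Q_1, \ldots, Q_r$, all primes below $p$ must occur before $p$, so (*) takes the form

$$1 + P_1 \cdots P_n = Q_1^{k_1} \cdots Q_r^{k_r} p^k$$

for some $k, k_1, \ldots, k_r \in \mathbb{Z}_{\ge 0}$. Now, applying Lemma 2 with the characters

$$\Bigl(\frac{-4}{\cdot}\Bigr), \quad \Bigl(\frac{\cdot}{p}\Bigr) \quad \text{and} \quad \Bigl(\frac{\cdot}{Q_1}\Bigr), \ldots, \Bigl(\frac{\cdot}{Q_r}\Bigr),$$

we can find a squarefree positive integer $d \equiv 1 \pmod{4}$ such that

$$\Bigl(\frac{d}{p}\Bigr) = \Bigl(\frac{-4}{p}\Bigr), \qquad \Bigl(\frac{d}{Q_i}\Bigr) = \Bigl(\frac{-4}{Q_i}\Bigr) \quad \text{for } i = 1, \ldots, r,$$

and with all prime factors of $d$ bounded by $O_\varepsilon\bigl((pQ_1 \cdots Q_r)^{\frac{1}{4\sqrt{e}} + \varepsilon}\bigr)$. Since $p \le x$ and $\frac{1}{4\sqrt{e}} < 1$, this bound must fall below $x$ for large enough $x$, and in fact it is not hard to see that there is such an $x \ll_\varepsilon (Q_1 \cdots Q_r)^{\frac{1}{4\sqrt{e} - 1} + \varepsilon}$. In that case every prime factor of $d$ occurs before $p$, so $d$ divides $P_1 \cdots P_n$, and the two character identities derived from (*) give $1 = -1$. This is a contradiction, and thus there must be another omitted prime $Q_{r+1} \ll_\varepsilon (Q_1 \cdots Q_r)^{\frac{1}{4\sqrt{e} - 1} + \varepsilon}$. □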

The proof of Theorem 2 is based on the following generalization of the method of Cox and van der Poorten. For each $i = 1, 2, \ldots$, let $g_i$ be the smallest positive primitive root (mod $P_i^2$), and let $\ell_i : (\mathbb{Z}/P_i^2\mathbb{Z})^\times \to \mathbb{Z}/P_i(P_i - 1)\mathbb{Z}$ be the base-$g_i$ logarithm. Suppose that we have computed $P_1, \ldots, P_N$. Note that if $n \ge N$ then for any $i \le N$, the left-hand side of (*) is $\equiv 1 \pmod{P_i}$ but $\not\equiv 1 \pmod{P_i^2}$ since the $P$'s are distinct. Thus, $k_1 \ell_i(q_1) + \ldots + k_r \ell_i(q_r) \equiv 0 \pmod{P_i - 1}$, but is non-zero (mod $P_i$). In other words, there is a vector $b_i \in \mathbb{F}_{P_i}^r$ such that $b_i \cdot (k_1, \ldots, k_r) \ne 0 \in \mathbb{F}_{P_i}$. On the other hand, we can construct other constraints (mod $P_i$) by considering (*) modulo any $P_j$ for which $P_j \equiv 1 \pmod{P_i}$ (if there are any). If $P_j$ is such a prime then $k_1 \ell_j(q_1) + \ldots + k_r \ell_j(q_r) \equiv 0 \pmod{P_i}$, i.e. there is a vector $v_{ij} \in \mathbb{F}_{P_i}^r$ such that $v_{ij} \cdot (k_1, \ldots, k_r) = 0 \in \mathbb{F}_{P_i}$.

Thus, we can try to prove that $q_r$ is omitted by finding a linear combination of the $v_{ij}$ which yields $b_i$. For $i = 1$, this is equivalent to Cox and van der Poorten's method. If that fails to exclude $q_r$ then we can try $i = 2$, and so on. Note that from a practical standpoint, one will accumulate equations modulo $P_1 = 2$ far more quickly than for the other primes. Thus, the greatest chance of success is with $i = 1$, so this is unlikely to yield any improvement over their method in practice. However, as our proof will show, the other primes become useful if there is a conspiracy which makes their method fail.

Lemma 3. Let $n$ be a squarefree positive integer, $q$ an integer which is relatively prime to $n$ and not a perfect $p$th power for any prime $p \mid n$, and $d$ a divisor of $n$. Then the field $L = \mathbb{Q}(\sqrt[d]{q}, e^{2\pi i/n})$ is normal over $\mathbb{Q}$ and has degree $[L : \mathbb{Q}] = d\varphi(n)$. Further, a rational prime $p$ not dividing the discriminant of $L$ splits completely in $L$ if and only if $p \equiv 1 \pmod{n}$ and $\exists x \in \mathbb{Z}$ such that $x^d \equiv q \pmod{p}$.

Proof (adapted from [9, Lemmas 3.1 and 3.2]). First note that $L$ is the splitting field of $(x^d - q)(x^n - 1)$, so it is normal over $\mathbb{Q}$. Set $\zeta_n = e^{2\pi i/n}$, and let $K = \mathbb{Q}(\zeta_n)$ be the corresponding cyclotomic field. Then $K$ has degree $\varphi(n)$ over $\mathbb{Q}$, so to establish the formula for $[L : \mathbb{Q}] = [L : K][K : \mathbb{Q}]$, it suffices to show that $x^d - q$ is irreducible over $K$.

To that end, we first show that $\sqrt[p]{q} \notin K$ for any prime divisor $p \mid d$. If $p$ is odd then $\mathbb{Q}(\sqrt[p]{q}) \subset \mathbb{R}$ is not normal over $\mathbb{Q}$ since it has non-real conjugates. On the other hand, every subfield of $K$ is normal over $\mathbb{Q}$ since $K$ is an abelian extension, and thus $\sqrt[p]{q} \notin K$. This argument fails if $p = 2$, but in that case it follows from class field theory that the quadratic subfields of $K$ are exactly those of the form $\mathbb{Q}(\sqrt{D})$ for fundamental discriminants $D \mid n$. Since $(q, n) = 1$, $\mathbb{Q}(\sqrt{q})$ is not among them, so the claim still holds.

Next, suppose that $f \in K[x]$ is a monic irreducible factor of $x^d - q$, of degree $d' < d$. Note that over $L$ we have the factorization

$$x^d - q = \prod_{j=1}^{d} \bigl(x - \zeta_d^j \sqrt[d]{q}\bigr),$$

where $\zeta_d = \zeta_n^{n/d}$ is a primitive $d$th root of unity. Thus, the constant term of $f$ must take the form $(-1)^{d'} \zeta_d^k q^{d'/d}$ for some integer $k$. Hence $q^{d'/d} \in K$, and by the Euclidean algorithm we can improve this to $q^{(d,d')/d} \in K$. However, since $0 \ne d' < d$, there is a prime $p \mid \frac{d}{(d,d')}$. This implies that $\sqrt[p]{q} \in K$, in contradiction to the above, and thus $x^d - q$ is irreducible over $K$, as claimed.

For the final statement, it is well-known that a rational prime $p$ splits completely in $K = \mathbb{Q}(\zeta_n)$ if and only if $p \equiv 1 \pmod{n}$, and this is a necessary condition for $p$ to split completely in $L \supseteq K$. If $p \equiv 1 \pmod{n}$, let $\mathfrak{p}$ be any of the $\varphi(n)$ primes of $K$ dividing $p\mathcal{O}_K$, where $\mathcal{O}_K$ is the ring of integers of $K$. If $p$ does not divide the discriminant of $L$ then $\mathfrak{p}$ splits completely in $L$ if and only if $x^d - q$ has $d$ roots in the residue field $\mathcal{O}_K/\mathfrak{p} \cong \mathbb{F}_p$, which in turn happens if and only if $q$ has a $d$th root (mod $p$). □

Lemma 4. Let $m$ be a squarefree positive integer and $q$ an integer which is relatively prime to $m$ and not a perfect $p$th power for any prime $p \mid m$. Then the set of primes $p$ for which $x^m \equiv q \pmod{p}$ is solvable has natural density $\frac{\varphi(m)}{m}$.

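Proof. Note that the number of solutions of $x^m \equiv q \pmod{p}$ is the same as that of $x^{(m,p-1)} \equiv q \pmod{p}$. For large $y > 0$, we thus want to estimate the fraction

$$\begin{aligned}
\frac{1}{\pi(y)} \sum_{p \le y} &\begin{cases} 1 & \text{if } x^{(m,p-1)} \equiv q \pmod{p} \text{ is solvable,} \\ 0 & \text{otherwise} \end{cases} \\
&= \sum_{d \mid m} \frac{1}{\pi(y)} \sum_{\substack{p \le y \\ (m,p-1) = d}} \begin{cases} 1 & \text{if } x^d \equiv q \pmod{p} \text{ is solvable,} \\ 0 & \text{otherwise} \end{cases} \\
&= \sum_{d \mid m} \sum_{e \mid \frac{m}{d}} \mu(e) \frac{1}{\pi(y)} \sum_{\substack{p \le y \\ p \equiv 1 \pmod{de}}} \begin{cases} 1 & \text{if } x^d \equiv q \pmod{p} \text{ is solvable,} \\ 0 & \text{otherwise} \end{cases} \\
&= \sum_{n \mid m} \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) \frac{1}{\pi(y)} \sum_{\substack{p \le y \\ p \equiv 1 \pmod{n}}} \begin{cases} 1 & \text{if } x^d \equiv q \pmod{p} \text{ is solvable,} \\ 0 & \text{otherwise.} \end{cases}
\end{aligned}$$

By Lemma 3 and the Chebotarev Density Theorem, the inner sum over $p$ divided by $\pi(y)$ tends to $\frac{1}{d\varphi(n)}$ as $y \to \infty$. (Note that the earlier Kronecker-Frobenius Density Theorem would be enough here if we instead considered the logarithmic density.) Thus, the set we are interested in has density

$$\sum_{n \mid m} \sum_{d \mid n} \frac{\mu(n/d)}{d\varphi(n)} = \sum_{n \mid m} \frac{1}{\varphi(n)} \sum_{d \mid n} \frac{\mu(n/d)}{d} = \sum_{n \mid m} \frac{\mu(n)}{n} = \frac{\varphi(m)}{m}.$$

□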
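The density $\varphi(m)/m$ in Lemma 4 can be checked empirically. The following sketch takes $m = 6$ and $q = 5$ (our choice of example, satisfying the hypotheses since 5 is coprime to 6 and is neither a square nor a cube) and counts the primes $p \le 50000$ for which $x^6 \equiv 5 \pmod{p}$ is solvable. It relies on the standard criterion that, for $p \nmid q$, the congruence $x^m \equiv q \pmod{p}$ is solvable exactly when $q^{(p-1)/g} \equiv 1 \pmod{p}$ with $g = (m, p-1)$; the cutoff and function names are assumptions made for the illustration.

```python
from math import gcd


def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, flag in enumerate(sieve) if flag]


def is_mth_power_residue(q, m, p):
    """For a prime p not dividing q: is x^m = q (mod p) solvable?"""
    g = gcd(m, p - 1)
    return pow(q, (p - 1) // g, p) == 1


m, q = 6, 5
primes = [p for p in primes_up_to(50000) if q % p != 0]
hits = sum(is_mth_power_residue(q, m, p) for p in primes)
print(hits / len(primes), "compared with phi(6)/6 =", 2 / 6)
```

The observed proportion should be close to $1/3$, although the rate of convergence is not addressed by the lemma.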



Proof of Theorem 2. Since $\{P_j\}_{j=1}^\infty$ is recursively enumerable, the only way that it can fail to be recursive is if there is some $Q_r$ for which there is no algorithm to prove that it does not occur among the $P_j$. In particular, the general strategy described above must fail to exclude $Q_r$, no matter how large we take $N$.

Note that for large enough $N$, (*) will take the form

$$1 + P_1 \cdots P_n = Q_1^{k_1} \cdots Q_r^{k_r}$$

for $n \ge N$. For $i = 1, \ldots, N$, let $b_i, v_{ij} \in \mathbb{F}_{P_i}^r$ be as described above. Although we have restricted to $i \le N$, we are free to consider arbitrarily large values of $j$ in this construction by taking $n \ge j$ in (*), so for each $i$ there are potentially infinitely many suitable $j$. In order to avoid eventually concluding that $Q_r$ is omitted, $b_i$ must not be a linear combination of the $v_{ij}$; in particular, the $v_{ij}$ span a proper subspace of $\mathbb{F}_{P_i}^r$, so there is a non-zero vector $w_i \in \mathbb{F}_{P_i}^r$ such that $v_{ij} \cdot w_i = 0$ for every $j$ such that $P_j \equiv 1 \pmod{P_i}$. By the Chinese Remainder Theorem, there are non-negative integers $a_1, \ldots, a_r < P_1 \cdots P_N$ such that $(a_1, \ldots, a_r) \equiv w_i \pmod{P_i}$ for $i = 1, \ldots, N$. Set $q = Q_1^{a_1} \cdots Q_r^{a_r}$. Then by construction, $q$ is not a perfect $P_i$th power for any $i \le N$, but it is a $P_i$th power residue (mod $P_j$) for all $j$ such that $P_j \equiv 1 \pmod{P_i}$. Note also that $q$ is automatically a $P_i$th power residue (mod $P_j$) if $P_j \not\equiv 1 \pmod{P_i}$.

It follows that the entire sequence $\{P_j : j = 1, 2, \ldots\}$ is a subset of the primes modulo which $q$ is an $m$th power residue, where $m = P_1 \cdots P_N$. By Lemma 4, that set has density

$$\frac{\varphi(m)}{m} = \prod_{i=1}^{N} \Bigl(1 - \frac{1}{P_i}\Bigr).$$

Taking $N$ arbitrarily large, we have

$$\limsup_{x\to\infty} \frac{\#\{j : P_j \le x\}}{\pi(x)} \le \prod_{i=1}^{\infty} \Bigl(1 - \frac{1}{P_i}\Bigr),$$

with the understanding that the right-hand side is 0 if the product diverges. In that case, $\{P_j\}_{j=1}^\infty$ has natural density 0, which in turn implies that the logarithmic density is 0. On the other hand, if the product converges then so does $\sum_{i=1}^\infty \frac{1}{P_i}$, which also implies that the logarithmic density is 0.

Finally, we remark that while it does not necessarily follow that $\{P_j\}_{j=1}^\infty$ has a natural density, the last inequality shows that its upper density is strictly less than 1; in fact, using just the values in Table 1, we see that the upper density is at most 0.277056. □
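The numerical bound can be reproduced from the product $\prod_i (1 - 1/P_i)$ truncated to known terms. The sketch below uses only the first seven terms of the second sequence; how many terms to include is our assumption, and the later terms are so large that they do not affect the first six decimal places.

```python
from fractions import Fraction

# First seven terms of Mullin's second sequence (OEIS A000946).
known_terms = [2, 3, 7, 43, 139, 50207, 340999]

bound = Fraction(1)
for P in known_terms:
    bound *= 1 - Fraction(1, P)  # exact rational arithmetic avoids rounding drift

print(float(bound))
```

Since every further factor is less than 1, any truncation of the product is a valid upper bound for the full product.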